Implicitly-Parallel Functional Dataflow for Productive Cloud Programming on Chameleon
Abstract
One solution that makes parallel programming implicit rather than explicit is the dataflow model. Conceived ~35 years ago, it has only recently been made practical through systems such as Dryad and Swift [1]. We believe that we have successfully created a base for an implicitly parallel functional dataflow programming model, as exemplified by Swift, a workflow language for executing scientific applications. This model has been characterized as a perfect fit for the many-task computing (MTC) paradigm. Some broad application classes that fit the MTC paradigm are workflows, MapReduce, high-throughput computing, and a subset of high-performance computing. MTC emphasizes using many computing resources over short periods of time to accomplish many smaller computational tasks (both dependent and independent), where the primary metrics are measured in seconds. MTC has proven successful in grid computing and supercomputing, but the distributed nature of today’s cloud resources poses many challenges to the efficient support of MTC workloads. This work aims to close the programmability gap between MTC and cloud computing through an innovative parallel scripting language, Swift, which will enable MTC workloads to leverage cloud resources efficiently. This work will enable a broader class of MTC applications to leverage cloud systems.
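To make the idea of implicit parallelism in a dataflow model concrete, here is a minimal Python sketch. It is not Swift syntax and does not reflect how Swift is implemented; the `dataflow` decorator and the `simulate`, `combine`, and `run_workflow` names are hypothetical, introduced only for illustration. Each call returns a future immediately, and a task runs once the futures it takes as arguments have values, so independent tasks can overlap without any explicit thread management in the workflow itself.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Shared worker pool; the size is an arbitrary choice for this sketch.
_pool = ThreadPoolExecutor(max_workers=8)


def dataflow(fn):
    """Wrap fn so a call returns a Future at once and runs when its inputs are ready."""
    def submit(*args):
        def run():
            # Wait for every Future argument to produce a value, then call fn.
            values = [a.result() if isinstance(a, Future) else a for a in args]
            return fn(*values)
        return _pool.submit(run)
    return submit


@dataflow
def simulate(seed):
    # Hypothetical stand-in for one small, independent task.
    return seed * seed


@dataflow
def combine(a, b):
    # Hypothetical stand-in for a task that depends on two earlier results.
    return a + b


def run_workflow(seeds):
    # The workflow is written as ordinary function calls; the ordering of
    # work is derived from which results each call consumes.
    sims = [simulate(s) for s in seeds]   # independent of one another
    total = sims[0]
    for s in sims[1:]:
        total = combine(total, s)         # each step depends on the previous total
    return total.result()


if __name__ == "__main__":
    print(run_workflow([1, 2, 3, 4]))
```

The sketch relies on the pool starting tasks in submission order: a task can only receive futures that were created before it, so the oldest unfinished task always has its inputs available and the workflow makes progress.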
Similar resources
Productive composition of extreme-scale applications using implicitly parallel dataflow
In every decade since the 1970s, computer scientists have re-examined dataflow-based execution models, hoping the programming productivity benefits these models promise can be realized on practical hardware platforms to implement useful applications. Based on the recent Swift/T implementation of “implicitly parallel functional dataflow” (IPFD) for extreme-scale systems, we believe that the data...
Cloud Computing Technology Algorithms Capabilities in Managing and Processing Big Data in Business Organizations: MapReduce, Hadoop, Parallel Programming
The objective of this study is to verify the importance of the capabilities of cloud computing services in managing and analyzing big data in business organizations, because the rapid development in the use of information technology in general, and network technology in particular, has led many organizations to make their applications available for use via electronic platforms hos...
Parallel Triangles Counting Using Pipelining
The general method for obtaining a parallel solution to a computational problem is to find a way to use the Divide & Conquer paradigm so that each processor acts on its own data and all of them can therefore be scheduled in parallel. MapReduce is an example of this approach: input data is transformed by the mappers in order to feed the reducers, which can run in parallel. In general this schema gives ...
A Comparison of Implicitly Parallel Multithreaded and Data-Parallel Implementations of an Ocean Model
Two parallel implementations of a state-of-the-art ocean model are described and analyzed: one is written in the implicitly parallel language Id for the Monsoon multithreaded dataflow architecture, and the other in data-parallel CM Fortran for the CM-5. The multithreaded programming model is inherently more expressive than the data-parallel model but is not especially adapted to regular data st...
Performance tuning scientific codes for dataflow execution
Performance tuning programs for dataflow execution involves tradeoffs and optimizations which may be significantly different than for execution on conventional machines. We examine some tuning techniques for scientific programs with regular control but irregular geometry. We use as an example the core of an ocean modeling code developed in the implicitly parallel language Id for the Monsoon dat...